Using The Matrix Ridge Approximation to Speedup Determinantal Point Processes Sampling Algorithms

نویسندگان

  • Shusen Wang
  • Chao Zhang
  • Hui Qian
  • Zhihua Zhang
چکیده

Determinantal point process (DPP) is an important probabilistic model that has extensive applications in artificial intelligence. The exact sampling algorithm of DPP requires the full eigenvalue decomposition of the kernel matrix which has high time and space complexities. This prohibits the applications of DPP from large-scale datasets. Previous work has applied the Nyström method to speedup the sampling algorithm of DPP, and error bounds have been established for the approximation. In this paper we employ the matrix ridge approximation (MRA) to speedup the sampling algorithm of DPP, showing that our approach MRA-DPP has stronger error bound than the Nyström-DPP. In certain circumstances our MRA-DPP is provably exact, whereas the Nyström-DPP is far from the ground truth. Finally, experiments on several real-world datasets show that our MRA-DPP is more accurate than the other approximation approaches.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Exact Sampling from Determinantal Point Processes

Determinantal point processes (DPPs) are an important concept in random matrix theory and combinatorics. They have also recently attracted interest in the study of numerical methods for machine learning, as they offer an elegant “missing link” between independent Monte Carlo sampling and deterministic evaluation on regular grids, applicable to a general set of spaces. This is helpful whenever a...

متن کامل

Kronecker Determinantal Point Processes

Determinantal Point Processes (DPPs) are probabilistic models over all subsets a ground set of N items. They have recently gained prominence in several applications that rely on “diverse” subsets. However, their applicability to large problems is still limited due to theO(N) complexity of core tasks such as sampling and learning. We enable efficient sampling and learning for DPPs by introducing...

متن کامل

Diverse Landmark Sampling from Determinantal Point Processes for Scalable Manifold Learning

High computational costs of manifold learning prohibit its application for large point sets. A common strategy to overcome this problem is to perform dimensionality reduction on selected landmarks and to successively embed the entire dataset with the Nyström method. The two main challenges that arise are: (i) the landmarks selected in non-Euclidean geometries must result in a low reconstruction...

متن کامل

Notes on using Determinantal Point Processes for Clustering with Applications to Text Clustering

In this paper, we compare three initialization schemes for the KMEANS clustering algorithm: 1) random initialization (KMEANSRAND), 2) KMEANS++, and 3) KMEANSD++. Both KMEANSRAND and KMEANS++ have a major that the value of k needs to be set by the user of the algorithms. (Kang 2013) recently proposed a novel use of determinantal point processes for sampling the initial centroids for the KMEANS a...

متن کامل

Nystrom Approximation for Large-Scale Determinantal Processes

Determinantal point processes (DPPs) are appealing models for subset selection problems where diversity is desired. They offer surprisingly efficient inference, including sampling in O(N) time and O(N) space, where N is the number of base items. However, in some applications, N may grow so large that sampling from a DPP becomes computationally infeasible. This is especially true in settings whe...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014